AITopics | descriptive feedback

descriptive feedback

274e6fcf4a583de4a81c6376f17673e7-AuthorFeedback.pdf

Neural Information Processing Systems, Oct-2-2025, 12:26:47 GMT

artificial intelligence, descriptive feedback, descriptive setup, (14 more...)

Technology: Information Technology > Artificial Intelligence > Representation & Reasoning (0.72)

Reward Learning from Multiple Feedback Types

Metz, Yannick, Geiszl, András, Baur, Raphaël, El-Assady, Mennatallah

arXiv.org Artificial Intelligence, Feb-28-2025

Learning rewards from preference feedback has become an important tool in the alignment of agentic models. Preference-based feedback, often implemented as a binary comparison between multiple completions, is an established method to acquire large-scale human feedback. However, human feedback in other contexts is often much more diverse. Such diverse feedback can better support the goals of a human annotator, and the simultaneous use of multiple sources might be mutually informative for the learning process or carry type-dependent biases for the reward learning process. Despite these potential benefits, learning from different feedback types has yet to be explored extensively. In this paper, we bridge this gap by enabling experimentation and evaluating multi-type feedback in a broad set of environments. We present a process to generate high-quality simulated feedback of six different types. Then, we implement reward models and downstream RL training for all six feedback types. Based on the simulated feedback, we investigate the use of types of feedback across ten RL environments and compare them to pure preference-based baselines. We show empirically that diverse types of feedback can be utilized and lead to strong reward modeling performance. This work is the first strong indicator of the potential of multi-type feedback for RLHF.
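The binary-comparison setup mentioned in the abstract can be illustrated with a minimal sketch. It assumes a Bradley-Terry style preference model, a common choice in preference-based reward learning; the abstract itself does not state which model the authors use. The function names `preference_probability` and `preference_loss` are hypothetical and chosen only for this illustration.

```python
import math


def preference_probability(reward_a: float, reward_b: float) -> float:
    """Probability that completion A is preferred over completion B.

    Assumption: a Bradley-Terry model, i.e. a logistic function of the
    difference between the two predicted rewards.
    """
    return 1.0 / (1.0 + math.exp(-(reward_a - reward_b)))


def preference_loss(reward_preferred: float, reward_rejected: float) -> float:
    """Negative log-likelihood of one observed binary comparison."""
    return -math.log(preference_probability(reward_preferred, reward_rejected))


# Equal rewards give no basis for a preference.
p_equal = preference_probability(0.0, 0.0)

# The loss is small when the reward model agrees with the annotator
# and large when it disagrees.
loss_agree = preference_loss(2.0, 0.0)
loss_disagree = preference_loss(0.0, 2.0)
```

Minimizing this loss over many comparisons pushes a reward model to score the preferred completion higher than the rejected one. The other feedback types studied in the paper would need their own loss terms, which are not sketched here.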

artificial intelligence, machine learning, reinforcement learning, (20 more...)

arXiv:2502.21038

Country: Europe > Switzerland > Zürich > Zürich (0.14)

Genre: Research Report (1.00)

Industry:

Education (0.46)
Energy > Oil & Gas > Upstream (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)

Learning Rewards from Linguistic Feedback

Sumers, Theodore R., Ho, Mark K., Hawkins, Robert D., Narasimhan, Karthik, Griffiths, Thomas L.

arXiv.org Artificial Intelligence, Sep-30-2020

We explore unconstrained natural language feedback as a learning signal for artificial agents. Humans use rich and varied language to teach, yet most prior work on interactive learning from language assumes a particular form of input (e.g. commands). We propose a general framework which does not make this assumption. We decompose linguistic feedback into two components: a grounding to $\textit{features}$ of a Markov decision process and $\textit{sentiment}$ about those features. We then perform an analogue of inverse reinforcement learning, regressing the teacher's sentiment on the features to infer their latent reward function. To evaluate our approach, we first collect a corpus of teaching behavior in a cooperative task where both teacher and learner are human. We use our framework to implement two artificial learners: a simple "literal" model and a "pragmatic" model with additional inductive biases. We baseline these with a neural network trained end-to-end to predict latent rewards. We then repeat our initial experiment pairing human teachers with our models. We find our "literal" and "pragmatic" models successfully learn from live human feedback and offer statistically-significant performance gains over the end-to-end baseline, with the "pragmatic" model approaching human performance on the task. Inspection reveals the end-to-end network learns representations similar to our models, suggesting they reflect emergent properties of the data. Our work thus provides insight into the information structure of naturalistic linguistic feedback as well as methods to leverage it for reinforcement learning.
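The regression step described in the abstract (regressing the teacher's sentiment on the grounded features to infer a latent reward function) can be sketched as follows. This is a simplified illustration and not the authors' implementation: it assumes a linear reward over features, fits it by ordinary least squares using plain gradient descent, and uses made-up data. The function name `fit_reward_weights` and all values are hypothetical.

```python
from typing import List, Sequence


def fit_reward_weights(
    features: Sequence[Sequence[float]],
    sentiments: Sequence[float],
    learning_rate: float = 0.1,
    steps: int = 2000,
) -> List[float]:
    """Least-squares regression of sentiment on features via gradient descent.

    Each row of `features` marks which MDP features one utterance refers to;
    the matching entry of `sentiments` is the sentiment of that utterance.
    Assumption: the latent reward is linear in the features.
    """
    n_samples = len(features)
    n_features = len(features[0])
    weights = [0.0] * n_features
    for _ in range(steps):
        gradient = [0.0] * n_features
        for row, sentiment in zip(features, sentiments):
            error = sum(w * x for w, x in zip(weights, row)) - sentiment
            for j, x in enumerate(row):
                gradient[j] += 2.0 * error * x / n_samples
        weights = [w - learning_rate * g for w, g in zip(weights, gradient)]
    return weights


# Hypothetical feedback: three features, four utterances.
features = [
    [1.0, 0.0, 0.0],  # utterance about feature 0, positive sentiment
    [0.0, 1.0, 0.0],  # utterance about feature 1, negative sentiment
    [0.0, 0.0, 1.0],  # utterance about feature 2, mildly positive
    [1.0, 1.0, 0.0],  # utterance about features 0 and 1, neutral overall
]
sentiments = [1.0, -1.0, 0.5, 0.0]

weights = fit_reward_weights(features, sentiments)
```

On this toy data the fitted weights approach 1.0, -1.0, and 0.5, so the inferred reward favors feature 0, penalizes feature 1, and mildly favors feature 2. The "pragmatic" model in the abstract adds inductive biases beyond this plain regression, which the sketch does not attempt to reproduce.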

learner, machine learning, reinforcement learning, (16 more...)

arXiv:2009.14715

Country:

North America > United States > New Jersey > Mercer County > Princeton (0.04)
North America > United States > New York > New York County > New York City (0.04)
North America > United States > Massachusetts > Suffolk County > Boston (0.04)
(3 more...)

Genre: Research Report (0.82)

Industry: Education > Educational Setting > Online (0.34)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.88)
